

Search for: All records

Creators/Authors contains: "Xie, C"



  1. https://www.nsta.org/connected-science-learning/connected-science-learning-march-april-2022/data-driven-science-vlogging 
  2. Abstract

    Design thinking is essential to the success of a design process because it helps achieve the design goal by guiding design decision-making. Therefore, a fundamental understanding of design thinking is vital for improving design methods, tools, and theories. However, interpreting design thinking is challenging because it is a cognitive process that is hidden and intangible. In this paper, we represent design thinking as an intermediate layer between human designers’ thought processes and their design behaviors. To do so, this paper first identifies five design behaviors based on current design theories: design action preference, one-step sequential behavior, contextual behavior, long-term sequential behavior, and reflective thinking behavior. Next, we develop computational methods to characterize each of the design behaviors. Specifically, we use design action distribution, a first-order Markov chain, Doc2Vec, a bi-directional LSTM autoencoder, and time gap distribution to characterize the five design behaviors, respectively. The characterization of the design behaviors through embedding techniques is essentially a latent representation of design thinking, and we refer to it as design embeddings. After obtaining the embeddings, an X-means clustering algorithm is applied to each embedding to cluster designers. The approach is applied to data collected from a high school solar system design challenge. The clustering results show that designers follow several design patterns according to the corresponding behavior, which corroborates the effectiveness of using design embeddings for design behavior clustering. The extraction of design embeddings based on the proposed approach can be useful in other design research, such as inferring design decisions, predicting design performance, and identifying design actions.

  3. Engineering design is often used to teach science, but it does not yet lead to solid learning gains. We examined the relationship between science learning and engineering design using text mining. Association rule mining was applied to texts written during design to extract the relationships between solar-energy concepts and solar design performance. Findings suggest that students test the effects of concept-related factors on design outcomes to learn concepts and eliminate misconceptions. These findings have implications for future instructional design.
  4. Host-managed shingled magnetic recording (HM-SMR) drives give a capacity advantage to harness the explosive growth of data. Applications where data is sequentially written and randomly read, such as key-value stores based on Log-Structured Merge Trees (LSM-trees), make HM-SMR an ideal solution due to its capacity, predictable performance, and economical cost. However, building an LSM-tree-based KV store on HM-SMR drives presents severe challenges in maintaining performance and space efficiency because of the redundant cleaning processes for applications and storage devices (i.e., compaction and garbage collection). To eliminate the overhead of on-disk garbage collection (GC) and improve compaction efficiency, this paper presents GearDB, a GC-free KV store tailored for HM-SMR drives. GearDB proposes three new techniques: a new on-disk data layout, compaction windows, and a novel gear compaction algorithm. We implement and evaluate GearDB with LevelDB on a real HM-SMR drive. Our extensive experiments show that GearDB achieves both good performance and space efficiency, i.e., it is on average 1.71× faster than LevelDB in random write with a space efficiency of 89.9%.
  5. MLC NAND flash memory uses the voltages of the memory cells to represent bits. High voltages cause much more damage to the cells than low voltages. The free space that need not store bits is leveraged to reduce the usage of those high voltages and thus extend the lifetime of the MLC memory. However, limited by the conventional data representation rule that represents bits by the voltage of one single cell, the high voltages are still used with high probability. To fully explore the potential of the free space for reducing the usage of high voltages, we propose a novel damage-aware data representation, named DREAM. DREAM uses the voltage combinations of multiple cells instead of the voltage of one single cell to represent bits. It allows the same bits to be represented by flexibly replacing the high voltages in some cells with low voltages in other cells when free space is available. Hence, the high voltages that cause more damage are used less often and the lifetime of the MLC memory is extended. Theoretical analysis results demonstrate the effectiveness and efficiency of DREAM.
  6. Key-value (KV) stores play an increasingly critical role in supporting diverse large-scale applications in modern data centers, which host terabytes of KV items that might even reside on a single server because of virtualization. The combination of an ever-growing volume of KV items and storage/application consolidation is driving a trend toward high storage density for KV stores. Shingled Magnetic Recording (SMR) represents a promising technology for increasing disk capacity, but it comes at the cost of poor random write performance and severe I/O amplification. Applications and software working with SMR devices need to be designed and optimized in an SMR-friendly manner. In this work, we present SEALDB, a Log-Structured Merge tree (LSM-tree) based key-value store that is specifically optimized for SMR drives by addressing the poor random write performance and severe I/O amplification. First, for LSM-trees, SEALDB concatenates the SSTables of each compaction and groups them into sets. Taking sets as the basic unit for compactions, SEALDB improves compaction efficiency by mitigating random I/Os. Second, SEALDB creates variable-size bands on HM-SMR drives, named dynamic bands. Dynamic bands not only accommodate the storage of sets, but also eliminate the auxiliary write amplification from SMR drives. We demonstrate the advantages of SEALDB via extensive experiments with various workloads. Overall, SEALDB delivers an impressive performance improvement. Compared with LevelDB, SEALDB is 3.42× faster on random load due to improved compaction efficiency and the elimination of auxiliary write amplification on SMR drives.
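
As an illustration of the first-order Markov chain mentioned in record 2 for characterizing one-step sequential behavior, the sketch below estimates transition probabilities between consecutive design actions. It is only a minimal sketch under stated assumptions: the function name `transition_matrix`, the action labels, and the sample logs are hypothetical and are not taken from the paper, whose actual implementation is not described in the abstract.

```python
from collections import Counter, defaultdict


def transition_matrix(sequences):
    """Estimate first-order Markov transition probabilities.

    `sequences` is a list of action sequences (lists of hashable labels).
    Returns a nested dict: matrix[current][next] = P(next | current).
    """
    # Count how often each action is immediately followed by each other action.
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    # Use every action seen anywhere as a state, in a stable order.
    states = sorted({action for seq in sequences for action in seq})

    matrix = {}
    for s in states:
        total = sum(counts[s].values())
        # States with no outgoing transitions get an all-zero row.
        matrix[s] = {t: (counts[s][t] / total if total else 0.0) for t in states}
    return matrix


# Hypothetical action logs for two designers (illustrative only).
logs = [
    ["add", "edit", "analyze", "edit"],
    ["add", "add", "edit"],
]
probs = transition_matrix(logs)
```

Flattening each designer's own matrix row by row would give one fixed-length vector per designer, which could then be passed to a clustering step such as the X-means clustering named in the abstract.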